Ensemble-Based Imputation for Genomic Selection: an Application to Angus Cattle

Authors

  • Chuanyu Sun
  • Xiao-Lin Wu
  • Kent A. Weigel
  • Guilherme J.M. Rosa
  • Stewart Bauck
  • Brent W. Woodward
  • Robert D. Schnabel
  • Jeremy F. Taylor
  • Daniel Gianola
Abstract

Imputation of moderate-density genotypes from low-density panels is of increasing interest in genomic selection, because it can markedly reduce genotyping costs. Several imputation software packages have been developed; however, these vary in imputation accuracy, and imputed genotypes may be inconsistent across methods. An AdaBoost-like approach was developed to combine imputation results from several independent software packages, i.e., Beagle (v3.3), IMPUTE (v2.0), fastPHASE (v1.4), AlphaImpute, findhap (v2), and FImpute (v2), with each package serving as a base classifier in an ensemble-based system. The ensemble method computes weights sequentially for all classifiers, and combines results from component methods via weighted majority “voting” to determine unknown genotypes. The data included 3,078 registered Angus cattle, each genotyped with the Illumina BovineSNP50 BeadChip. SNP genotypes on three chromosomes (BTA1, BTA16, and BTA28) were used to compare imputation accuracy among methods, and our application involved imputation of 50K genotypes covering 29 chromosomes based on a set of 5K genotypes. Beagle and FImpute had the greatest accuracy, which ranged from 0.8677 to 0.9858. The proposed ensemble method was more accurate than any of these packages, but the sequence of independent classifiers in the voting scheme affected imputation accuracy. The ensemble systems yielding the best imputation accuracies were those that had Beagle as first classifier, followed by one or two methods that utilized pedigree information. A salient feature of our ensemble method is that it can resolve imputation inconsistencies among different imputation methods, hence leading to a more reliable system for imputing genotypes relative to use of independent methods.
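
The abstract does not give the weight formula, so the following is only a minimal sketch of how sequentially computed weights and weighted majority voting could fit together. It assumes a SAMME-style multi-class AdaBoost update, genotypes coded as allele counts 0/1/2, and a validation set of masked genotypes with known true values. The function names `sequential_weights` and `weighted_vote` are hypothetical and are not taken from the paper or from any of the imputation packages.

```python
import math
from collections import defaultdict


def sequential_weights(calls, truth, n_classes=3, eps=1e-10):
    """Compute one voting weight per package, in the given order.

    calls: list of call lists, one per package, in voting order
           (assumed to come from masked genotypes with known truth).
    truth: list of true genotypes (assumed coding: 0, 1, 2).
    Uses a SAMME-style update; this is an assumption, not the paper's formula.
    """
    n = len(truth)
    w = [1.0 / n] * n  # per-genotype weights, start uniform
    alphas = []
    for pred in calls:
        miss = [p != t for p, t in zip(pred, truth)]
        # Weighted error of this package under the current genotype weights.
        err = sum(wi for wi, m in zip(w, miss) if m)
        err = min(max(err, eps), 1.0 - eps)  # keep the logarithm finite
        alpha = math.log((1.0 - err) / err) + math.log(n_classes - 1)
        alphas.append(alpha)
        # Up-weight genotypes this package got wrong, then renormalize,
        # so later packages are judged mainly on the hard cases.
        w = [wi * math.exp(alpha) if m else wi for wi, m in zip(w, miss)]
        total = sum(w)
        w = [wi / total for wi in w]
    return alphas


def weighted_vote(calls_at_genotype, alphas):
    """Return the genotype with the largest summed package weight."""
    score = defaultdict(float)
    for call, alpha in zip(calls_at_genotype, alphas):
        score[call] += alpha
    return max(score, key=score.get)
```

Because each package's error is measured against weights already shifted by the packages before it, reordering `calls` generally changes the resulting weights, which is in line with the abstract's observation that the classifier sequence affects accuracy.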

Similar resources

Genomic heritabilities and genomic estimated breeding values for methane traits in Angus cattle.

Enteric methane emissions from beef cattle are a significant component of total greenhouse gas emissions from agriculture. The variation between beef cattle in methane emissions is partly genetic, whether measured as methane production, methane yield (methane production/DMI), or residual methane production (observed methane production - expected methane production), with heritabilities ranging ...

Effect of Reference Population Size and Imputation Methods on the Accuracy of Imputation in Pure and Mixed Populations

Imputation of high-density genotypes from low-density chips has been introduced to increase the accuracy of genomic selection in animals. In the current study, to investigate imputation accuracy, three populations, mixed (scenario 1), pure (scenario 2), and mixed + pure (scenario 3), were simulated using QMSim. Two methods of imputation including Beagle and FImpute were used fo...

The Effect of Dams of Sire Path Management on Genetic and Economic Parameters in a Simulated Genomic Selection Program

A deterministic model based on the gene flow method, considering the features of Iranian Holstein cattle population, was implemented in this study to evaluate the effect of altering the number of age-classes in the dams of future sire (DS) path and the number of dams required for breeding a young bull (YB), to be evaluated as future sire, on genetic gain and resultant economic efficiency of a g...

Evaluation of genomic prediction accuracy under different genomic architectures of quantitative and threshold traits with imputation of simulated genomic data, using the random forest method

Genomic selection is a promising challenge for discovering genetic variants influencing quantitative and threshold traits for improving the genetic gain and accuracy of genomic prediction in animal breeding. Because a proportion of genotypes is generally uncalled, genomic prediction requires imputation of missing genotypes. The objectives of this study were (1) to quantify...

Estimation of genotype imputation accuracy using reference populations with varying degrees of relationship and marker density panel

Genotype imputation from low-density to high-density (SNP) chips is an important step before applying genomic selection, because denser chips can provide more reliable genomic predictions. In the current research, the accuracy of genotype imputation from low and moderate-density panels (5K and 50K) to high-density panels in the purebred and crossbred populations was assessed. The simulated popu...

Publication date: 2012